
    Adaptive ResNet Architecture for Distributed Inference in Resource-Constrained IoT Systems

    Full text link
    As deep neural networks continue to grow in size and complexity, most edge devices are unable to handle their extensive processing requirements. Distributed inference, which partitions the neural network across a cluster of nodes, is therefore essential. However, distribution may introduce additional energy consumption and dependencies among devices that suffer from unstable transmission rates. Unstable transmission rates harm the real-time performance of IoT devices, causing high latency, high energy usage, and potential failures. Hence, dynamic systems need a resilient DNN with an adaptive architecture that can downsize according to the available resources. This paper presents an empirical study that identifies the connections in ResNet that can be dropped without significantly impacting the model's performance, enabling distribution in case of resource shortage. Based on the results, a multi-objective optimization problem is formulated to minimize latency and maximize accuracy under the available resources. Our experiments demonstrate that an adaptive ResNet architecture can reduce shared data, energy consumption, and latency throughout the distribution while maintaining high accuracy. Comment: Accepted at the International Wireless Communications & Mobile Computing Conference (IWCMC 2023).
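
    The block-skipping idea lends itself to a compact illustration. The sketch below is our own, with hypothetical names; it is not the paper's code or its block-selection policy. It uses PyTorch to make each residual block's transform branch switchable, so a resource-aware controller can disable blocks when resources run short: because a block computes y = x + F(x), disabling F(x) leaves a valid identity path rather than breaking the forward pass.

```python
# Minimal sketch (assumed, not the paper's implementation) of an
# adaptive ResNet whose residual blocks can be dropped at inference.
import torch
import torch.nn as nn

class SkippableBlock(nn.Module):
    """A basic residual block whose transform branch can be disabled."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.active = True  # toggled by a resource-aware controller

    def forward(self, x):
        # When inactive, only the identity shortcut is evaluated,
        # saving the block's computation and transmitted activations.
        return torch.relu(x + self.body(x)) if self.active else x

class AdaptiveResNet(nn.Module):
    def __init__(self, channels=64, num_blocks=8, num_classes=10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        self.blocks = nn.ModuleList(
            [SkippableBlock(channels) for _ in range(num_blocks)]
        )
        self.head = nn.Linear(channels, num_classes)

    def configure(self, keep):
        """Downsize the network according to available resources."""
        for block, flag in zip(self.blocks, keep):
            block.active = flag

    def forward(self, x):
        x = self.stem(x)
        for block in self.blocks:
            x = block(x)
        return self.head(x.mean(dim=(2, 3)))

model = AdaptiveResNet()
model.configure([True, True, False, True, False, True, True, True])  # drop 2 blocks
out = model(torch.randn(1, 3, 32, 32))
```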

    RL-OPRA: Reinforcement Learning for Online and Proactive Resource Allocation of crowdsourced live videos

    Get PDF
    © 2020 Elsevier B.V. With the advancement of rich media generating devices, the proliferation of live Content Providers (CP), and the availability of convenient internet access, crowdsourced live streaming services have witnessed unexpected growth. To ensure a better Quality of Experience (QoE), higher availability, and lower costs, large live streaming CPs are migrating their services to geo-distributed cloud infrastructure. However, because of the dynamics of live broadcasting and the wide geo-distribution of viewers and broadcasters, it is still challenging to satisfy all requests with reasonable resources. To overcome this challenge, we introduce in this paper a prediction-driven approach that estimates the potential number of viewers near different cloud sites at the instant of broadcasting. This online and instant prediction of distributed popularity distinguishes our work from previous efforts that provision constant resources or alter their allocation only after the popularity of the content changes. Based on the derived predictions, we formulate an Integer Linear Program (ILP) to proactively and dynamically choose the right data center to allocate exact resources and serve potential viewers while minimizing the perceived delays. As solving the optimization is not fast enough for online serving, we propose a real-time approach based on Reinforcement Learning (RL), namely RL-OPRA, which adaptively learns to optimize the allocation and serving decisions by interacting with the network environment. Extensive simulation and comparison with the ILP show that our RL-based approach achieves near-optimal results and outperforms heuristic-based approaches. This work was supported by the Qatar Foundation.
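
    To make the proactive formulation concrete, here is a toy version of such an ILP written with the PuLP solver. It is an assumed, simplified model with made-up sites, viewer predictions, and delays; the paper's exact variables and constraints may differ. Each predicted viewer group is assigned to exactly one cloud site so that viewer-weighted delay is minimized under per-site capacity.

```python
# Toy proactive-allocation ILP (assumed formulation, illustrative data).
import pulp

sites = ["dc1", "dc2", "dc3"]
groups = ["g1", "g2"]                      # predicted viewer clusters
viewers = {"g1": 120, "g2": 80}            # predicted viewers per group
capacity = {"dc1": 150, "dc2": 100, "dc3": 100}
delay = {("g1", "dc1"): 10, ("g1", "dc2"): 40, ("g1", "dc3"): 60,
         ("g2", "dc1"): 50, ("g2", "dc2"): 15, ("g2", "dc3"): 30}

prob = pulp.LpProblem("proactive_allocation", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", (groups, sites), cat="Binary")

# Objective: total viewer-weighted perceived delay.
prob += pulp.lpSum(viewers[g] * delay[(g, s)] * x[g][s]
                   for g in groups for s in sites)
# Each viewer group is served by exactly one site.
for g in groups:
    prob += pulp.lpSum(x[g][s] for s in sites) == 1
# Per-site capacity limits.
for s in sites:
    prob += pulp.lpSum(viewers[g] * x[g][s] for g in groups) <= capacity[s]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for g in groups:
    for s in sites:
        if x[g][s].value() > 0.5:
            print(f"{g} -> {s}")
```

    Solving such a program per time slot is what becomes too slow at serving time, which is the gap the RL agent fills by learning the allocation policy directly.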

    Zero-touch realization of Pervasive Artificial Intelligence-as-a-service in 6G networks

    Full text link
    The vision of the upcoming 6G technologies, characterized by ultra-dense networks, low latency, and high data rates, is to support Pervasive AI (PAI) using zero-touch solutions enabling self-X services (e.g., self-configuration, self-monitoring, and self-healing). However, research on 6G is still in its infancy, and only the first steps have been taken to conceptualize its design, investigate its implementation, and plan for use cases. Toward this end, academia and industry communities have gradually shifted from theoretical studies of AI distribution to real-world deployment and standardization. Still, designing an end-to-end framework that systematizes AI distribution by allowing easier access to the service through a third-party application, assisted by zero-touch service provisioning, has not been well explored. In this context, we introduce a novel platform architecture to deploy a zero-touch PAI-as-a-Service (PAIaaS) in 6G networks supported by a blockchain-based smart system. This platform aims to standardize pervasive AI at all levels of the architecture and unify the interfaces in order to facilitate service deployment across application and infrastructure domains, relieve users' worries about cost, security, and resource allocation, and, at the same time, respect the stringent performance requirements of 6G. As a proof of concept, we present a Federated Learning-as-a-service use case in which we evaluate the ability of our proposed system to self-optimize and self-adapt to the dynamics of 6G networks, in addition to minimizing the users' perceived costs. Comment: IEEE Communications Magazine.
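
    As a rough illustration of the Federated Learning-as-a-service use case, the sketch below is our own minimal example, not the platform's implementation. It shows the aggregation step such a service endpoint would run: clients train locally and the service returns the data-weighted average of their parameters (FedAvg), so raw data never leaves the devices.

```python
# Minimal FedAvg aggregation sketch (our illustration, hypothetical names).
import numpy as np

def fedavg(client_weights, client_sizes):
    """Data-weighted average of per-client model parameter lists."""
    total = sum(client_sizes)
    return [
        sum(w[layer] * (n / total) for w, n in zip(client_weights, client_sizes))
        for layer in range(len(client_weights[0]))
    ]

# Two hypothetical clients, each holding a two-layer model.
w1 = [np.ones((4, 4)), np.zeros(4)]
w2 = [np.zeros((4, 4)), np.ones(4)]
global_model = fedavg([w1, w2], client_sizes=[300, 100])
print(global_model[0][0, 0])  # 0.75: dominated by the larger client
```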

    Pervasive AI for IoT applications: A Survey on Resource-efficient Distributed Artificial Intelligence

    Get PDF
    Artificial intelligence (AI) has witnessed a substantial breakthrough in a variety of Internet of Things (IoT) applications and services, spanning from recommendation systems and speech processing applications to robotics control and military surveillance. This is driven by easier access to sensory data and the enormous scale of pervasive/ubiquitous devices that generate zettabytes of real-time data streams. Designing accurate models from such data streams to revolutionize the decision-making process establishes pervasive computing as a worthy paradigm for a better quality of life (e.g., smart homes and self-driving cars). The confluence of pervasive computing and artificial intelligence, namely Pervasive AI, has expanded the role of ubiquitous IoT systems from mere data collection to executing distributed computations, offering a promising alternative to centralized learning while presenting various challenges, including privacy and latency requirements. In this context, intelligent resource scheduling is needed among IoT devices (e.g., smartphones, smart vehicles) and infrastructure (e.g., edge nodes and base stations) to avoid communication and computation overheads and ensure maximum performance. In this paper, we conduct a comprehensive survey of the recent techniques and strategies developed to overcome these resource challenges in pervasive AI systems. Specifically, we first present an overview of pervasive computing, its architecture, and its intersection with artificial intelligence. We then review the background, applications, and performance metrics of AI, particularly Deep Learning (DL) and reinforcement learning, running in a ubiquitous system. Next, we provide a deep literature review of communication-efficient techniques, from both algorithmic and system perspectives, for distributed training and inference across combinations of IoT devices, edge devices, and cloud servers. Finally, we discuss our future vision and research challenges.
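
    For a flavor of the communication-efficient techniques this survey reviews, the sketch below is illustrative only and does not reproduce any specific surveyed scheme. It implements top-k gradient sparsification, a common family of methods in which a worker transmits only the k largest-magnitude gradient entries and their indices instead of the full dense tensor.

```python
# Top-k gradient sparsification sketch (illustrative, not from the survey).
import numpy as np

def topk_sparsify(grad: np.ndarray, k: int):
    """Keep the k largest-magnitude entries; return (indices, values)."""
    flat = grad.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx]

def desparsify(idx, values, shape):
    """Rebuild a dense (approximate) gradient on the receiver side."""
    flat = np.zeros(int(np.prod(shape)))
    flat[idx] = values
    return flat.reshape(shape)

grad = np.random.randn(256, 256)
idx, vals = topk_sparsify(grad, k=655)        # transmit ~1% of the entries
restored = desparsify(idx, vals, grad.shape)  # sparse approximation
print(f"compression ratio: {grad.size / (2 * len(idx)):.0f}x")
```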

    A Survey on Mobile Edge Computing for Video Streaming : Opportunities and Challenges

    Get PDF
    5G communication brings substantial improvements in the quality of service provided to various applications by achieving higher throughput and lower latency. However, interactive multimedia applications (e.g., ultra-high-definition video conferencing, 3D and multiview video streaming, crowdsourced video streaming, cloud gaming, and virtual and augmented reality) are becoming more ambitious, with high-volume, low-latency video streams putting strict demands on already congested networks. Mobile Edge Computing (MEC) is an emerging paradigm that extends cloud computing capabilities to the edge of the network, i.e., to the base station level. To meet latency requirements and avoid end-to-end communication with remote cloud data centers, MEC allows video content to be stored and processed (e.g., caching, transcoding, pre-processing) at the base stations. Both video-on-demand and live video streaming can utilize MEC to improve existing services and develop novel use cases, such as video analytics and targeted advertisements. MEC is expected to reshape the future of video streaming by providing ultra-reliable and low-latency streaming (e.g., in augmented reality, virtual reality, and autonomous vehicles), pervasive computing (e.g., in real-time video analytics), and blockchain-enabled architectures for secure live streaming. This paper presents a comprehensive survey of recent developments in MEC-enabled video streaming bringing unprecedented improvements to enable novel use cases. A detailed review of the state of the art is presented, covering novel caching schemes, optimal computation offloading, cooperative caching and offloading, and the use of artificial intelligence (i.e., machine learning, deep learning, and reinforcement learning) in MEC-assisted video streaming services.
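
    As a minimal illustration of edge caching, the basic mechanism that the surveyed schemes refine, the sketch below is our own example, not a scheme from the survey. It implements an LRU video-segment cache that a base station could consult before fetching a segment over the backhaul from the cloud.

```python
# LRU video-segment cache at the base station (illustrative sketch).
from collections import OrderedDict

class EdgeSegmentCache:
    def __init__(self, capacity_segments: int):
        self.capacity = capacity_segments
        self.store = OrderedDict()  # segment_id -> bytes

    def get(self, segment_id):
        if segment_id in self.store:
            self.store.move_to_end(segment_id)  # mark as recently used
            return self.store[segment_id]       # edge hit: low latency
        return None                             # miss: fetch from cloud

    def put(self, segment_id, data):
        self.store[segment_id] = data
        self.store.move_to_end(segment_id)
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)      # evict least recently used

cache = EdgeSegmentCache(capacity_segments=2)
cache.put("video42/seg1", b"...")
cache.put("video42/seg2", b"...")
cache.get("video42/seg1")          # refreshes seg1
cache.put("video42/seg3", b"...")  # evicts seg2, the least recently used
```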

    Network architectures and energy optimization for massive data centers

    No full text
    The increasing trend of migrating applications, computation, and storage into more robust systems has led to the emergence of mega data centers hosting tens of thousands of servers. As a result, designing a data center network that interconnects this massive number of servers and provides efficient, fault-tolerant routing services is becoming an urgent need, and it is the challenge addressed in this thesis. Since this is a hot research topic, many solutions have been proposed, such as adopting new interconnection technologies and new algorithms for data centers. However, many of these solutions generally suffer from performance problems or can be quite costly. In addition, existing efforts have not focused on quality of service and power efficiency in data center networks. To provide a novel solution that avoids the drawbacks of prior work while retaining its advantages, we propose new data center interconnection networks that aim to build a scalable, cost-effective, high-performance, and QoS-capable networking infrastructure. In addition, we implement power-aware algorithms to make the network energy efficient. Hence, we particularly investigate the following issues: 1) fixing the architectural and topological properties of the newly proposed data centers and evaluating their performance and their capacity to provide robust systems in a faulty environment; 2) proposing routing, load-balancing, fault-tolerance, and power-efficiency algorithms to apply to our architectures, and examining their complexity and how well they satisfy the system requirements; 3) integrating quality of service; 4) comparing our proposed data centers and algorithms to existing solutions in a realistic environment. In this thesis, we address a challenging topic in which we first study the existing models, then propose improvements and suggest new methodologies and algorithms.

    The evolution of online services and the advent of big data have brought the internet into every aspect of our lives: communication and information exchange (e.g., Gmail and Facebook), web search (e.g., Google), online shopping (e.g., Amazon), and video streaming (e.g., YouTube). All these services are hosted on physical sites called data centers, which are responsible for storing, managing, and providing fast access to all the data. All the equipment making up a company's information system (mainframes, servers, storage arrays, network and telecommunications equipment, etc.) can be grouped in these data centers. This computing and technological evolution has driven exponential growth of data centers, raising problems of equipment installation costs, energy, heat emission, and the performance of the services offered to customers. Thus, scalability, performance, cost, reliability, energy consumption, and maintenance have become major challenges for these data centers. Motivated by these challenges, the research community has begun exploring new routing mechanisms and algorithms, as well as new architectures, to improve the quality of service of the data center. In this thesis project, we developed new algorithms and architectures that combine the advantages of the proposed solutions while avoiding their limitations. The points addressed during this project are: 1) proposing new topologies and studying their properties, their performance, and their construction costs; 2) designing routing algorithms and models to reduce energy consumption while taking complexity and fault tolerance into account; 3) designing protocols and queue management systems to provide good quality of service; 4) evaluating the new systems by comparing them to other architectures and models in realistic environments.

    A guaranteed performance of a green data center based on the contribution of vital nodes

    No full text
    In order to satisfy the need for critical computing resources, many data center architectures have been proposed that house a huge number of network devices. These devices are provisioned to achieve the highest performance when the network is fully utilized. However, the peak capacity of the network is rarely reached. Consequently, many devices sit in the idle state and cause huge energy waste, leading to a disproportion between the network load and the energy consumed. In this paper, we propose a power-aware routing algorithm that saves energy with a negligible trade-off in network performance. The idea is to keep active only the source and destination devices and the vital nodes participating in the communication. Vital nodes in the network are calculated only once and can then be looked up with constant time complexity. Besides its short computation time, our routing algorithm also guarantees high performance and shows over 50% energy savings.
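
    The precompute-once idea can be sketched as follows. This is our interpretation: we rank nodes by betweenness centrality as a stand-in for the paper's vital-node definition, which may differ. The ranking is computed once offline with networkx, and each communication then keeps only the source, destination, and vital nodes powered on.

```python
# Vital-node sketch (betweenness centrality assumed as the ranking metric).
import networkx as nx

def precompute_vital_nodes(topology: nx.Graph, fraction: float = 0.2):
    """One-off ranking of switches; afterwards membership checks are O(1)."""
    scores = nx.betweenness_centrality(topology)
    k = max(1, int(fraction * topology.number_of_nodes()))
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

def active_nodes(vital, src, dst):
    """Devices kept powered for one communication; all others may sleep."""
    return vital | {src, dst}

topology = nx.random_regular_graph(d=4, n=20, seed=1)  # stand-in topology
vital = precompute_vital_nodes(topology)               # computed only once
print(active_nodes(vital, src=0, dst=7))
```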

    PTNet: A parameterizable data center network

    No full text
    This paper presents PTNet, a new data center topology that is specifically designed to offer high, parameterizable scalability with a single-layer architecture. Furthermore, despite its high scalability, PTNet provides reduced latency and high performance in terms of capacity and fault tolerance. Consequently, compared to widely known data center networks, our new topology shows better capacity, robustness, and cost-effectiveness, with lower power consumption. Conducted experiments and theoretical analyses illustrate the performance of the novel system. © 2016 IEEE.